REMARKS

Claims 1-44 are pending in the application. Claims 1, 12, 17, and 23 have been
amended. No claims have been allowed. Applicants respectfully request entry of the
foregoing amendments and allowance of the application as amended.

Petition for extension of time 

A petition for a three month extension of time under 37 C.F.R. § 1.136(a) is
included herewith, as well as the fee under 37 C.F.R. § 1.17(a)(3).

Double Patenting Rejection 

Claims 1-44 were rejected under the judicially created doctrine of obviousness- 
type double patenting as being unpatentable over claims 1-39 of U.S. Patent 
application publication number 2003/0228023 (U.S. Patent Application No. 
10/400,282). Applicants submit herewith a Terminal Disclaimer in compliance with 37
C.F.R. § 1.130(c), as well as the fee under 37 C.F.R. § 1.20(d). Applicants respectfully
request withdrawal of the rejection.

Rejections under 35 U.S.C. §102

Claims 1-3, 5-18, 20-30, and 33-34 were rejected under 35 U.S.C. § 102(b) as
being anticipated by Holzrichter (U.S. patent number 5,729,694, hereinafter 
"Holzrichter"). Applicants respectfully traverse the rejection. 
The examiner states in paragraph 5 of the Office action: 



5. Claims 1-3, 5-18, and 33-44 are rejected under 35 U.S.C. 102(b) as being
anticipated by Holzrichter (US PAT. 5,729,694).

Consider claim 1 Holzrichter teaches a method for removing noise from acoustic
signals, comprising:

receiving (see fig. 5 (43, 52)) a plurality of acoustic signals;

receiving (43) information on the vibration of human tissue associated with human
voicing activity;

generating at least one first transfer function (57) representative of the plurality of
acoustic signals upon determining that voicing information is absent (such as, unvoice)
from the plurality of acoustic signals for at least one specified period of time (such as,
time frames) (see col. 28 line 38-48); and

removing noise (removing noise is inherent to speech recognition algorithm to
extract the best speech feature and avoid noise) from the plurality of acoustic signals
using the first transfer (57) function to produce at least one denoised acoustic data
stream (60, see col. 15 line 29-col. 16 line 3 and col. 60 line 19-30).

Applicants respectfully submit that claims 1-3, 5-18, 20-30, and 33-34 are not 
anticipated by Holzrichter. Holzrichter lacks at least one element of the claims. 
Specifically, Holzrichter lacks at least the following elements: transfer function(s); and 
use of more than one microphone. To explain the differences between the claimed 
invention and Holzrichter, and to support Applicants' statement that Holzrichter lacks
the stated elements, Applicants submit the following analysis.

Applicants submit that while Holzrichter does discuss receiving both acoustic
(microphone-based) and "EM wave" (col. 15, line 19) based measurements,
Holzrichter uses only a single microphone. Further, Holzrichter uses the "EM wave"
only to measure "the conditions of the vocal folds and the glottal tissue surrounding the
vocal fold structure" (col. 15, lines 26-28). Applicants do not claim to measure the
conditions of vocal folds or surrounding structure, but rather claim the use of a
physiologically-based device to determine the voice activity detection (VAD) signal,
which is not mentioned in Holzrichter. In addition, in contrast to Holzrichter,
Applicants claim "receiving a plurality of acoustic signals, wherein receiving the
plurality of acoustic signals includes receiving using a plurality of independently
located microphones," (Claim 1 as amended).

Holzrichter does not generate any transfer function as in (57). Rather, 
Holzrichter generates a Fourier Transform, which is a measure of the frequency content
of a signal. In order to generate a transfer function, two signals (an input and output)
are needed. Holzrichter has only one signal and simply transforms it into the frequency 
domain, which is completely different from, and exclusive of, taking two signals and 
generating a transfer function. Even the quoted passage at col. 28 lines 38-48 does not 
discuss time frames - it is simply a discussion of how Holzrichter thinks the end of a 
speech period should be calculated when a voiced to unvoiced transition takes place: 

In the case that the speech changes from voiced to
unvoiced, the last glottal open/close period of the voiced
speech sequence has no "next" glottal cycle to use to define
its end of period. In one approach, the algorithm continually
tests the length of each glottal closed-time in each time
frame for excessive length (e.g. 20% longer than the preceding
glottal period closed-time). If the period is tested to
be too long, the algorithm terminates the period and assigns,
for example, a glottal-closure time-duration equal to the
fractional closure time of the glottal function measured in
the preceding time frames.

The foregoing is not related to segmenting data into time frames for processing. 
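
By way of illustration only, and not as a characterization of either the claimed method or of Holzrichter's system, the distinction drawn above between a Fourier transform (computed from one signal) and a transfer function (computed from an input and an output) can be sketched as follows. The sketch is in Python with NumPy; the function names, the three-tap example path `h`, and the use of circular convolution are assumptions chosen for the example and do not appear in the application or in Holzrichter.

```python
import numpy as np

def fourier_transform(x):
    """Frequency content of ONE signal."""
    return np.fft.rfft(x)

def transfer_function(x_in, y_out, eps=1e-12):
    """Estimate H(f) = Y(f) / X(f), which relates TWO signals (input and output)."""
    X = np.fft.rfft(x_in)
    Y = np.fft.rfft(y_out)
    # Multiply by conj(X) so the division is by the real quantity |X|^2;
    # eps guards against division by zero.
    return Y * np.conj(X) / (np.abs(X) ** 2 + eps)

rng = np.random.default_rng(0)
n = 256
x = rng.standard_normal(n)            # hypothetical input signal
h = np.array([0.5, 0.3, 0.2])         # hypothetical path between input and output
# Circular convolution, so that Y = H * X holds exactly in the DFT domain.
y = np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(h, n), n)

X = fourier_transform(x)              # needs only x; carries no information about h
H_est = transfer_function(x, y)       # needs both x and y; recovers the path h
H_true = np.fft.rfft(h, n)
```

In this sketch `X` depends on `x` alone, whereas `H_est` cannot be formed without the second signal `y`.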

Finally, (57) and (60) along with col. 15 line 29 - col. 16 line 3 and col. 60
lines 19-30 have nothing to do with removing noise. (57) is a Fourier Transform, as
noted above, and (60) is a "Vocal tract feature extractor" which is unrelated to the present
claims. Col. 15 line 29 - col. 16 line 3 describes Holzrichter's perception of how a
"vocal tract Fourier Transform" would appear (col. 15 line 67 - col. 16 line 1). This is to
be distinguished as being completely different from calculating a transfer function as
claimed.

FIG. 5 shows a system in which knowledge of the
vocalized excitation function is used to deconvolve the
speech vocal tract transfer function information from
measured acoustic speech output each time frame. All of the
information gathered during each speech time frame, including
acoustics, EM sensor information and deconvolved
[illegible] quantized, and stored (along with control information) in a
feature vector [illegible] the speaker's voice during one or
more speech time frames. Similar deconvolving procedures
are used with unvoiced excitation functions. As shown in
FIG. 5, an EM sensor control unit 40 drives a repetition rate
trigger 41, which drives pulse generator 42, which transmits
one or more pulses from antenna 43. EM sensor control unit
40 sets the pulse format, time frame interval, integration
times, memory locations, function forms, and controls and
initializes pulse generator 42. Control unit 40 and trigger 41
also actuate switch 45 through delay 44 to range gate
received pulses. Antenna 43 is positioned to direct transmitted
pulses towards the vocal organs and receive pulses
reflected therefrom. The received pulses pass through switch
45 and are integrated by integrator 46, then amplified by
amplifier 47, and passed through a high pass filter 48 to a
processing unit 49. Processing unit 49 contains an AD
converter for digitizing the EM signals and also includes
zero location detector, memory detector, and obtains glottal
area versus time. The digitized and processed data from unit
49 is stored in memory bins 50, from which excitation
function feature vectors are formed in block 51.
Simultaneously, signals from an acoustic microphone 52 are
digitized by AD converter 53, which is also [illegible] and
synchronized by EM sensor control unit 40. The digitized
data from AD converter 53 is stored in memory bins 54 from
which acoustic feature vectors are formed in block 55. The
digitized vocal fold data from memory bins 50 is used to
produce a glottal Fourier transform 56, while the digitized
acoustic data in memory bin 54 is used to produce an
acoustic Fourier transform 57. The two Fourier transforms
56, 57 are deconvolved in block 58 to produce a vocal tract
Fourier transform 59 which is then fit to a prechosen
functional form to form a vocal tract feature vector in block
60.



Finally, in col. 60 lines 19-30, Holzrichter discusses removing "acoustically
generated noise" from the "glottal signal" using "Fourier transform techniques" (col.
60 lines 19-22). This is not related to the present claims because:

1. The noisy signal of Holzrichter is a "glottal signal", while the claimed noisy
signal is an acoustic microphone signal.

2. Holzrichter does not describe or embody his "Fourier transform techniques",
so they are not available to be compared to the claimed techniques.

Acoustically generated noise can be removed from the
glottal signal by using microphone information to subtract
the noise signal, or by using Fourier transform techniques to
filter out acoustic signals from the glottal motion signals.

The functional shape of the volume air flow excitation
function in real time, and in transform space (Fourier or Z
transform), can be approximated, including the glottal zero
(or closed) time. An excitation feature vector is constructed
by defining an approximation functional (or table) to the
measured excitation function and by obtaining a series of
numerical coefficients that describe the functional fitting to
the numerical data for the defined time frame(s).



The last part (col. 60 lines 23-30) discusses how to approximate a "volume flow
excitation function" using "functional [sic] (or table)". This has no bearing on the 
claims. For example, the claims do not include a "volume flow excitation function". 
In addition, Applicants respectfully submit that Holzrichter's "functional [sic] (or
table)" is not described with enough specificity to determine what is being referred to.

The examiner also states:

Consider claims 2-3 Holzrichter teaches the method of removing noise further
comprises:

generating at least one second transfer function (see fig. 5 (56)) representative of the
plurality of acoustic signals upon determining that voicing information is present in the
plurality of acoustic signals for the at least one specified period of time (such as, time
frames) (see col. 28 line 38-48), and removing noise (removing noise is inherent to
speech recognition algorithm to extract the best speech feature and avoid noise) from
the plurality of acoustic signals using at least one combination (58) of the at least one
first transfer function (57) and the at least one second transfer function (56) to produce
at least one denoised acoustic data stream (see col. 15 line 29-col. 16 line 3 and col. 60
line 19-30); and the plurality of acoustic signals include inherently (because the EM
sensor 43 and microphone 52 picks up the noise source signal and the acoustic signal)
at least one reflection of at least one associated noise source signal and at least one
reflection of at least one acoustic source signal (see col. 14 line 46-67 and col. 24 line
29-61).

Applicants respectfully submit that Holzrichter only mentions noise removal in
a single context. Specifically, Holzrichter desires to remove "acoustically generated
noise" from the "glottal signal" using "Fourier transform techniques" (col. 60 lines 19-
22). Not once, in any context, does Holzrichter mention the removal of acoustic noise
from a microphone signal using a second microphone signal. The references to (col.
28 lines 38-48) and (col. 15 line 29 - col. 16 line 3) and (col. 60 lines 19-30) have been
shown to be unrelated above. Col. 14 lines 46-67 states:

FIGS. 3A and FIG. 3B show two types of laboratory
apparatus for measuring the simultaneous properties of
several speech organs using EM sensors and for obtaining
simultaneous acoustic information. FIG. 3A, in particular,
shows highly accurate laboratory instrumentation assembled
to obtain very high fidelity, linear, and very large dynamic
range information on the vocal system during each speech
time frame. FIG. 3A shows a view of a head with three
antennas 21, 22, 23 and an acoustic microphone 24 mounted
on a support stand 25. Antennas 21, 22, 23 are connected to
pulse generators 26a, b, c through transmit/receive switches
27a, b, c respectively. Pulse generators 26a, b, c apply pulses
to antennas 21, 22, 23, which are directed to various parts of
the vocal system. Antennas 21, 22, 23 pick up reflected
pulses, which are then transmitted back through switches
27a, b, c to pulse receivers and digitizers (e.g., sample and
hold units) 28a, b, c. Acoustic information from microphone
24 is also input into pulse receiver and digitizer 28d. Support
stand 25 positions the antennas 21, 22, 23 to detect signals
from various parts of the vocal tract, e.g., by using face
positioning structure 29 and chest positioning structure 30.
As shown, antenna 21 is positioned to detect the tongue, lip,



This is a simple description of the multiple antenna and SINGLE microphone 
configuration envisioned by Holzrichter. In contrast, Applicants claim the use of
multiple microphones. 



Col. 24 lines 29-61 state:

3) Remove post-glottal pressure induced vibrations of
glottal tissue and nearby tissue from the EM sensor signal,
and therewith from the associated model of volume air flow
versus sensor signal. Use one of two related methods.
Method 3A) Filter the raw EM sensor excitation signal using
transform or circuit techniques to remove the acoustic
pressure induced higher frequency noise, but preserve the
needed low frequency excitation function shape information
for model generated values of volume air flow and for
subsequent feature vector formation. Method 3B) Use the
tissue vibration signal from the EM sensor and the acoustic
output (corrected for timing delays) to determine the backward
acoustic transfer function. Divide the Fourier transform
of the vibration signal by that of the acoustic signal,
and store the numerical (or curve fit) transfer function
information in memory for recall as needed. Next, for each
time frame, use the backward transfer function to calculate
the glottal tissue vibration level associated with the measured
output acoustic signal. Then subtract the backward
transferred acoustic signal from the EM-sensor generated
and processed signal, to obtain a "noise free" excitation
function signal. This signal represents a backward traveling
acoustic sound wave that induces mechanical vibrations of
glottal tissue and nearby air tract tissues in directions
transverse to the air flow. This acoustic wave has little effect
on the positions of the vocal fold edges, and thus it does not
affect the actual volume air flow U. However, certain EM
sensors do measure this noise, and it shows up on the EM
signal describing the excitation function (see FIG. 4B for an
example). This noise level is found to be speaker specific.
For high fidelity, speaker independent excitation function
coding, such vibration signals mixed with the gross air flow
values are undesirable.



In this passage, Holzrichter discusses how he envisions removing "post-glottal
pressure induced vibrations" (i.e. the speech of the user) from the "EM sensor signal"
(col. 24 lines 29-30). Holzrichter lacks elements of the claimed invention, and even
teaches away from the claimed invention. For example:

1. The claimed signal of interest is the user's speech, whereas the user's speech is
"noise" in Holzrichter.

2. Holzrichter's signal of interest is the "EM sensor signal" which Holzrichter
desires to associate with volume air flow.

3. The claimed noise is environmental acoustic noise, whereas the speech of the
user is noise in Holzrichter.

The Office action continues:

Consider claims 5-8 Holzrichter teaches that the method of removing noise further
includes generating at least one third transfer function (see (59)) using the at least
one first transfer function (57) and the at least one second transfer function (56); the
method of generating the at least one first transfer function (see fig. 5 (57)) comprises
recalculating the at least one first transfer function during at least one prespecified
interval (see col. 19 line 26-col. 20 line 15); and the method of generating the at least
one second transfer function (see (56)) comprises recalculating the at least one
second transfer function during at least one prespecified interval (see col. 19 line 26-col.
20 line 15); and the method of generating the at least one first transfer function (see
fig. 5 (57)) comprises use of at least one technique selected from a group consisting of
adaptive techniques and recursive techniques (see col. 19 line 26-col. 20 line 15).

As shown above, (56, 57, and 59) are Fourier transforms, not transfer functions
as in the claims. Col. 19 line 26 - col. 20 line 15 states:

The feature vector shown in FIG. 12A for the sound /ah/,
was constructed using a total of p feature vector coefficients,
c1 through cp, to describe the processed data. In this
example, c1 is used to describe the type of transfer functions
used, e.g. [illegible] means the use of an ARMA functional in the
"pole" and "zero" formulation; c2 describes the number of
"poles" and c3 describes the number of "zeros" used for the
fitting; c4 indicates the kind of speech unit being spoken, e.g.
"0" means isolated phoneme; c5 describes the type of
connection to a preceding acoustic sound unit to be used,
e.g. "0" means a connection to the silence phoneme is
needed; c6 describes the connection to the following unit,
e.g. "0" means a connection to a following silence phoneme
is needed; c7 describes the 300 ms multi-frame speech
segment envelope; c8 is the pitch (e.g., 120 vocal fold
cycles/sec); and c9 describes the bandwidth of the fundamental
harmonic. Other feature vector coefficients that
describe the relative ratios of the 2nd through the 10th
harmonic power to the first harmonic, are taken from the
power transform of the vocal excitation (FIG. 10B). In
addition the fall of the harmonic excitation power per
octave, above 1 kHz, can be described by a line with -12
db/octave negative slope. The "pole" and "zero" coefficient
data (FIG. 12B) are shown and stored as appropriate coefficients
in the vector in FIG. 12A. The last coefficient is
the symbol for the sound [illegible]
information from a CASR or similar system which is the
acoustic energy per frame. If the user desires to use the
alternative formulation of the ARMA transfer functional, the
"a" and "b" coefficients can be used (see FIG. 12C).

An alternative approach to describe the feature vector for
the "long" speech segment /ah/ is to perform Fourier transformations
each 8.3 ms (the period for 120 Hz excitation),
and to join 36 individual pitch period frames into a 300 ms
long multiple frame speech segment. A second alternative
approach would be to take the Fourier transform of the entire
300 ms segment, since it was tested to be constant; however
the FFT algorithm would need to handle the large amount of
data. Because of the constancy of the acoustic phoneme unit
/ah/, the user chose to define the 300 ms period of constancy
first, and to then process (i.e., FFT) the repetitive excitation
and output acoustic signal with a convenient 10 ms period 30
times, and then average the results.

As a test (see Section below on Speech Synthesis) a
synthetic speech segment was reconstructed from information
in a vector like the one shown in FIG. 12A. The vocal
fold excitation function was first reconstructed using the
harmonic amplitude and phase information to generate a
source term over an interval of 100 ms. The excitation
function was sampled at 11 kHz or higher. The time sampled
sequence was used to drive the ARMA model specified by
a difference equation with poles and zeros. The output of the
ARMA model was used to reconstruct the speech sound /ah/
as shown in the section on Speech Synthesis (see FIG. 19),
and a pleasing sound, /ah/, was generated and heard by the
user.



Again as before, Holzrichter envisions calculating a "feature vector" to describe 
the speech excitation (using Fourier transforms and other unnamed techniques), and the 
vocal tract transfer function (a model of the configuration of the vocal tract), and the 
speech itself. None of this is relevant to the claims, which do not calculate, or 
approximate, or model the speech excitation or vocal tract transfer function in any way. 
Holzrichter simply lacks any teaching or suggestion regarding the methods, including 
transfer functions, as claimed. 



Continuing: 

Consider claims 9-11 Holzrichter teaches that the method of information on the
vibration of human tissue is provided by a mechanical sensor (such as, motion sensor)
in contact with the skin (see figs 3a-3b (29, 30, 33) and see col. 14 line 46-col. 15 line
18); and the method of information on the vibration of human tissue is provided via at
least one sensor selected from among at least one of an accelerometer, a skin surface
microphone in physical contact with skin of a user, a human tissue vibration detector, a
radio frequency (RF) vibration detector, and a laser vibration detector (see figs 3a-
3b (29, 30, 33) and see col. 14 line 46-col. 15 line 18); and the human tissue is at least
one of on a surface of a head, near the surface of the head, on a surface of a neck,
near the surface of the neck, on a surface of a chest, and near the surface of the
chest (see figs 3a-3b (29, 30, 33) and see col. 14 line 46-col. 15 line 18).

The quoted passage (col. 14 line 46 - col. 15 line 18) does not, in fact, describe
any mechanical sensor in contact with the skin. Nor does the quoted passage,
according to Holzrichter, describe detecting vibrations associated with speech.
According to Holzrichter (col. 14 lines 47-48):

apparatus for measuring the simultaneous properties of
several speech organs using EM sensors and for obtaining

whereby the speech organs are listed as (col. 14 line 67 - col. 15 line 3):

As shown, antenna 21 is positioned to detect the tongue, lip,
velum, etc. Antenna 22 is positioned to detect tongue and
jaw motion and antenna 23 is positioned to detect vocal fold
motion.

and "properties" is not explained. Holzrichter simply lacks any teaching or suggestion
regarding the use of mechanical or EM sensors to detect skin vibration due to user
speech as in the claims.

Continuing: 

Consider claim 12 Holzrichter teaches a method for removing noise from
electronic signals, comprising:

detecting (see fig. 5, (43, sensor)) an absence (unvoice) of voiced information
during at least one period (see col. 28 line 38-48), wherein detecting includes
measuring the vibration of human tissue (see col. 5 line 6[illegible]-col. 8 line 55); receiving at
least one noise source signal during the at least one period (see col. 24 line 29-61);
generating at least one transfer function (57) representative of the at least one noise
source signal; receiving at least one composite signal comprising acoustic and noise
signals; and removing the noise (removing noise is inherent to speech recognition
algorithm to extract the best speech feature and avoid noise) signal from the at least
one composite signal using the at least one transfer function to produce at least one
denoised acoustic data stream (60, see col. 15 line 29-col. 16 line 3 and col. 60 line 19-
30).

As shown above, Holzrichter never mentions generating a transfer function - 
(57) is a Fourier transform, a completely different process. Also, as discussed above, 
Holzrichter's signal of interest and "noise" are completely different from the claimed
signal of interest and the claimed noise. Applicants have also shown that Holzrichter
does not teach detecting vibrations associated with the user's speech. 

Continuing: 

Consider claims 16 and 20 Holzrichter teaches that the method of receiving includes
receiving the at least one noise source signal using at least one microphone (see
(52)); and the method of removing the noise signal from the at least one composite
signal using the at least one transfer function (see fig. 5 (59)) includes generating at
least one other transfer function (57) using the at least one transfer function (see col. 15
line 29-col. 16 line 3).

Again, Holzrichter uses a single microphone, and never considers that the
microphone signal might be polluted with environmental noise. In contrast,
environmental noise is at the center of the present claims. Holzrichter never generates
a transfer function, only using Fourier transforms, and the only "noise" discussed by
Holzrichter is the speech of the user. Applicants emphasize that the speech of the user
is what Applicants' invention is intended to keep. Further, Holzrichter does not detail
the manner in which he would remove the "noise" in any way.

Consider claim 23 Holzrichter teaches a method for removing noise from electronic
signals, comprising:

determining (see fig. 5 (40)) at least one unvoicing period during which voiced
information is absent (such unvoice) based on vibration of human tissue;

receiving (43, 52) at least one noise signal input during the at least one unvoicing
period (see col. 28 line 38-48) and generating at least one unvoicing transfer
function (56) representative of the at least one noise signal (see col. 24 line 29-61);

receiving (43, 52) at least one composite signal comprising acoustic and noise
signals; and removing the noise signal (removing noise is inherent to speech
recognition algorithm to extract the best speech feature and avoid noise) from the at
least one composite signal using the at least one unvoicing transfer function to produce at
least one denoised acoustic data stream (60, see col. 15 line 29-col. 16 line 3 and col.
60 line 19-30).

Holzrichter fails to teach or suggest removing acoustic environmental noise
from the user's speech, which is a subject matter of the claimed invention. Figure 5
(40) is the "EM sensor control unit", which is not enumerated and which has nothing to
do with determining unvoiced periods. Figure 5 (43) and (52) do not receive noise;
rather they receive "properties of several speech organs" (col. 14 lines 47-48) and the
(not-noisy) speech of the user. As shown above, the only "noise" in Holzrichter's
system is the user's speech, which is the claimed signal of interest. Holzrichter further
lacks any teaching regarding transfer functions, but rather teaches only the use of
Fourier transforms.

Consider claim 24 Holzrichter teaches that the method of producing at least one
denoised acoustic data stream further includes:

determining (see fig. 5 (40)) at least one voicing period during which voiced
information is present; receiving (52) at least one acoustic signal input from at least one
signal sensing device during the at least one voicing period (see col. 28 line 38-48)
and generates at least one voicing transfer function (57) representative of the at least
one acoustic signal; and removing the noise signal from the at least one composite
signal using at least one combination of the at least one unvoicing transfer function (56)
and the at least one voicing transfer function (57) to produce the denoised acoustic data
stream (60, see col. 15 line 29-col. 16 line 3 and col. 60 line 19-30).

Holzrichter never discusses producing a denoised acoustic data stream as claimed.
Instead, Holzrichter discusses "removing 'post-glottal pressure induced vibrations' (i.e.
the speech of the user) from the 'EM sensor signal' " (col. 24 lines 29-30). Thus,
Holzrichter not only lacks elements of the claims, but is not even directed toward
achieving a result similar to that achieved by the claimed invention.

Consider claim 26 Holzrichter teaches a system for removing noise from the acoustic
signals, comprising:

at least one receiver (see fig. 5 (52)) that receives at least one acoustic signal;

at least one sensor (43) that receives human tissue vibration information associated
with human voicing activity;

at least one processor (see fig. 3b (processing electronics)) coupled among the at
least one receiver and the at least one sensor (52, 43) that generates a plurality of
transfer functions (56, 57, 59), wherein at least one first transfer function (57)
representative of the at least one acoustic signal is generated in response to a
determination that voicing information is absent (unvoice) from the at least one acoustic
signal for at least one specified period of time (such as, time frames) (see col. 28 line 38-
48), wherein noise is removed (removing noise is inherent to speech recognition
algorithm to extract the best speech feature and avoid noise) from the at least one
acoustic signal using the first transfer function to produce at least one denoised acoustic
data stream (60, see col. 15 line 29-col. 16 line 3 and col. 60 line 19-30).

Again, Holzrichter never discusses removing noise from acoustic signals.
Holzrichter only discusses "removing 'post-glottal pressure induced vibrations' (i.e. the
speech of the user) from the 'EM sensor signal' " (col. 24 lines 29-30).

Consider claim 33 Holzrichter teaches a system for removing noise from acoustic
signals, comprising at least one processor (see fig. 3b (processing electronics)) coupled
among at least one microphone (see fig. 5 (52)) and at least one voicing sensor (43),
wherein the at least one voicing sensor (43) detects human tissue vibration associated
with voicing, wherein an absence of voiced information (unvoice) is detected during at
least one period (such as, time frames) (see col. 28 line 38-48) using the at least one
voicing sensor, wherein at least one noise source signal is received during the at least
one period using the at least one microphone (52), wherein the at least one processor
generates at least one transfer function (57) representative of the at least one noise
source signal, wherein the at least one microphone (52) receives at least one composite
signal comprising acoustic and noise signals, and the at least one processor removes
the noise signal (removing noise is inherent to speech recognition algorithm to extract
the best speech feature and avoid noise) from the at least one composite signal using
the at least one transfer function (57) to produce at least one denoised acoustic data
stream (60, see col. 15 line 4-col. 16 line 3 and col. 60 line 19-30).

Throughout Holzrichter's disclosure, Holzrichter fails to teach or suggest
removing noise from acoustic signals. Holzrichter only discusses "removing 'post-
glottal pressure induced vibrations' (i.e. the speech of the user) from the 'EM sensor
signal' " (col. 24 lines 29-30). This illustrates not only that Holzrichter lacks claimed
elements, but that Holzrichter is not attempting a teaching suggestive of the claims.

Consider claim 35 Holzrichter teaches a signal processing system (see fig. 3b,
(processing electronic)) coupled among at least one user and at least one electronic
device (see fig. 3b, (processing electronic)), wherein the signal processing system
(processing electronic) includes at least one denoising subsystem (see fig. 5) for
removing noise from acoustic signals, the denoising subsystem (fig. 5) comprising at
least one processor coupled among at least one receiver and at least one sensor (43,
EM sensor), wherein the at least one receiver is coupled to receive at least one acoustic
signal; wherein at least one sensor (43) detects human tissue vibration associated with
human voicing activity (see col. 15 line 4-18), wherein the at least one processor
generates a plurality of transfer functions (56, 57, 59), wherein at least one first transfer
function (56) representative of the at least one acoustic signal is generated in response
to a determination that voicing information is absent (such as, unvoice) from the at least
one acoustic signal for at least one specified period of time (such as, time frames) (see
col. 28 line 38-48), wherein noise is removed (removing noise is inherent to speech
recognition algorithm to extract the best speech feature and avoid noise) from the at
least one acoustic signal using the first transfer function to produce at least one
denoised acoustic data stream (60, see col. 15 line 4-col. 16 line 3 and col. 60 line 19-
30).

Applicants respectfully reiterate that Holzrichter does not contain or suggest 
claimed elements in any of the claims. For example: 

1. Holzrichter only uses a single microphone that is assumed to be noise free; 
while the claims include at least two microphones that are assumed to be noisy; 

2. Holzrichter does not calculate transfer functions between microphones (as in 
the claimed invention), but only Fourier transforms of clean acoustic data and 
"EM signals"; 

3. Holzrichter does not detect skin vibrations due to user speech; and 

4. Holzrichter does not use a voice activity detection (VAD) signal to determine 
when to update the transfer functions between the microphones. 

Rejections under 35 U.S.C. § 103 

Claims 4, 19, and 32 were rejected under 35 U.S.C. § 103(a) as being 
unpatentable over Holzrichter. Applicants respectfully traverse the rejection. 
The examiner states in paragraph 7 of the Office action: 

7. Claims 4, 19 and 32 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Holzrichter (US PAT. 5,729,694). 

Consider claim 4 Holzrichter teaches implementing a microphone (acoustic sensor, 
e.g. microphone col. 16 lines 14-15; col. 11 lines 29-30) coupled to a processor but 
does not teach implementing a plurality of microphones. At the time of the invention, it 
would have been obvious to one of ordinary skill in the art to implement a plurality of 
microphones for flexibility; thus gathering acoustic information in various areas instead 
of implementing one microphone in that it is restricted to a certain area. 

Consider claims 19 and 32, they are essentially similar to claim 4 and are rejected 
for the reason stated above apropos to claim 4. 

Applicants are uncertain of the meaning of "flexibility". The sentence goes on 
to state "thus gathering acoustic information in various areas instead of implementing 
one microphone in that it is restricted to a certain area". From the latter part of the 
sentence, Applicants can only conclude that "flexibility" has nothing to do with the use 
of two microphones as in the claimed system. The claimed system relies on two 
microphones to function. At least two microphones are the minimum number for 
implementing the claimed method and system. One of ordinary skill in the art would 
find no motivation to modify Holzrichter as suggested because Holzrichter does not 
teach or suggest the claimed method of processing data from ANY microphone (never 
mind more than one microphone) in the claimed manner or to achieve the claimed 
effect. The disclosure of Holzrichter is simply not enabling of a multiple-microphone 
method or system, and is not even suggestive of such a method or system. For all of 
these reasons, Applicants respectfully submit that claims 4, 19, and 32 would not have 
been obvious in view of Holzrichter. 

Claim 31 was rejected under 35 U.S.C. § 103(a) as being unpatentable over 
Holzrichter in view of Sugiyama. Applicants respectfully traverse the rejection. 

The examiner states in paragraph 8 of the Office action that 

Holzrichter does not clearly teach the system of further 
comprising: dividing acoustic data of the at least one acoustic 
signal into a plurality of subbands; removing noise from each of 
the plurality of subbands using the at least one first transfer 
function, wherein a plurality of denoised acoustic data streams are 
generated; and 6 [sic] combining the plurality of denoised acoustic 
data streams to generate the at least one denoised acoustic data 
stream. 

The Office action further states that Sugiyama teaches the system of further 
comprising: dividing (fig. 1 (50)); removing (6); and combining ((8) and col. 1, lines 
12-35). 

Applicants respectfully submit that the proposed combination does not result in 
claim 31. Sugiyama does not overcome the deficiencies of Holzrichter. For example, 
Sugiyama is not concerned with noise removal at all and so certainly fails to teach or 
suggest transfer functions. Sugiyama (5,517,435) simply implements a standard 
adaptive filter to do system identification, not noise removal. For all of these reasons, 
Applicants respectfully submit that the invention of claim 31 would not have been 
obvious in view of the cited references. 

CONCLUSION 

In view of the foregoing amendments and Remarks, Applicants respectfully 
submit that any objections and rejections have been overcome, and the claims are now 
allowable. Prompt allowance of the application is earnestly solicited. Examiner Lao is 
respectfully requested to telephone the undersigned to facilitate resolution of any issues 
prior to allowance of the application. 



AUTHORIZATION TO CHARGE DEPOSIT ACCOUNT 

If there are any fees due and unpaid in this application, please charge our 
Deposit Account No. 503616 for these fees. 



Respectfully submitted, 

Courtney Stamford & Gregory LLP 



Date: July 23, 2007 